Abstract. A network's topology information can be given as an adjacency
matrix. The bitmap of sorted adjacency matrix (BOSAM) is a network
visualisation tool which makes different network structures visible simply by
inspecting the reordered adjacency matrix. A BOSAM picture resembles the shape
of a flower and is characterised by a series of 'leaves'. Here we show and
mathematically prove that for most networks, there is a self-similar relation
between the envelopes of the BOSAM leaves. This self-similar property allows us
to use a single envelope to predict all other envelopes and therefore
reconstruct the outline of a network's BOSAM picture. We liken the BOSAM
envelope to a human fingerprint, as they share a number of common features:
both are simple, easy to obtain, and strongly characteristic, encoding
essential information for identification.

Key words: complex network, mixing patterns, visualisation, BOSAM

1 Introduction 

During the last decade there has been an international effort to understand the
structure and dynamics of complex networks in social, biological and
technological systems [1][2][3][4][5][6][7][8][9][10][11][12]. These networks
are very large, containing thousands or even millions of entities (nodes)
interacting with each other (links), and their structures are irregular,
evolving and inherently stochastic. Statistical physics methods have been
widely used in studying complex networks.

Complementary to this effort, a number of network visualisation tools have
been proposed to illustrate network topologies, such as [13][14][15]. These
techniques take advantage of the extraordinary human ability to recognise
patterns in images, and therefore allow us to compare two networks by seeing
whether their visualisations look similar to each other.

Of particular interest to us is a tool called the bitmap of sorted adjacency
matrix (BOSAM) [13]. It sorts a network's nodes in a specific order such that
the bitmap representation of the reordered adjacency matrix resembles a 'flower'. 
The shape of the flower reveals many topological properties of the network. A 
BOSAM flower consists of a series of 'leaves', each of which is characterised by 
its envelope. 
In this paper we demonstrate and mathematically prove that the way the
adjacency matrix is reordered for BOSAM gives rise to a self-similar relation
between the envelopes of the leaves. This self-similar property allows us to use
just one envelope to predict all other envelopes and therefore recover the shape
of a BOSAM flower. We also show that if a network preserves its macroscopic
structure during growth, the BOSAM envelopes scale with the network's size. We
remark that an envelope of a network's BOSAM is analogous to a human
fingerprint: it is a small token that is easy to obtain, valid for life, and
encodes essential information for identification.



2 Bitmap Of Sorted Adjacency Matrix 

For a network with $N$ nodes, the connectivity information between the nodes can
be given as an $N \times N$ adjacency matrix, in which entry $a_{ij}$ is the
number of links connecting nodes with indexes $i, j \in \{1, 2, ..., N\}$. For
an undirected, simple network (no self-loops or repeated links), the adjacency
matrix becomes a symmetric (0, 1)-matrix with zeros on its diagonal, where entry
$a_{ij}$ is mirrored by entry $a_{ji}$. This matrix can be represented as a
black-and-white bitmap, i.e. if $a_{ij} = 1$, a black pixel is placed at the
coordinate $(i, j)$; otherwise a white pixel is placed there. Such a bitmap is
not very informative if node indexes are randomly assigned.

Degree $k$ is defined as the number of links a node has. For a given network, we
sort nodes in ascending order of degree. Nodes having the same degree are
arranged in ascending order of the largest neighbour degree, $\omega$, which is
the largest degree among a node's neighbours. Nodes having both the same degree
and the same largest neighbour degree are arranged in ascending order of the
largest neighbour index, $e$, which is the largest index among a node's
neighbours. We then assign each node a new index equal to its position in the
sorted list. Then, for two nodes with indexes $i < j$, one of the following
holds:

- $k_i < k_j$;

- $k_i = k_j$ and $\omega_i < \omega_j$;

- $k_i = k_j$, $\omega_i = \omega_j$ and $e_i < e_j$.
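
The sorting rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bosam_order`, the dictionary-of-neighbours input and the 0-based positions are all hypothetical. Because the largest neighbour index depends on the final ordering itself, the sketch resolves the circularity with two passes (a provisional sort by degree and largest neighbour degree, then a tie-break on the largest neighbour position taken from that provisional order); the text does not say how this is handled in practice.

```python
def bosam_order(neighbours):
    """Return the nodes sorted by (degree, largest neighbour degree,
    largest neighbour index), using a two-pass approximation.

    neighbours: dict mapping each node to an iterable of its neighbours
                (undirected, simple network).
    """
    nodes = list(neighbours)
    # k: degree of each node
    k = {v: len(neighbours[v]) for v in nodes}
    # w: largest degree among a node's neighbours (0 for isolated nodes)
    w = {v: max((k[u] for u in neighbours[v]), default=0) for v in nodes}

    # Pass 1: provisional order by (k, w); Python's sort is stable.
    provisional = sorted(nodes, key=lambda v: (k[v], w[v]))
    pos = {v: i for i, v in enumerate(provisional)}

    # e: largest provisional position among a node's neighbours
    e = {v: max((pos[u] for u in neighbours[v]), default=-1) for v in nodes}

    # Pass 2: final order by (k, w, e).
    return sorted(nodes, key=lambda v: (k[v], w[v], e[v]))
```

For example, `bosam_order({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})` orders a four-node path; the new index of a node is its position in the returned list.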

The above node sorting rule produces a reordered adjacency matrix, whose 
bitmap visualisation is called the bitmap of sorted adjacency matrix (BOSAM). 
Fig. 1 shows BOSAMs for six networks. The names and sources of the datasets
for these networks are given in Table 1. Each BOSAM picture resembles a 'flower' 
which consists of a series of 'leaves' symmetrically arranged along the bitmap's 
diagonal. The shape of the flower reflects a number of network topological prop- 
erties. For simplicity, in the following we only discuss vertical leaves above the 
diagonal. 
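
Once a node order is available, the bitmap itself is straightforward to build. The sketch below is illustrative only: `bosam_bitmap` and `to_text` are hypothetical helper names, the order is assumed to be a list of nodes already sorted by the BOSAM rule, and the text rendering (with the origin at the bottom-left so that high-degree nodes end up in the top-right corner) is an assumption about how the pictures are oriented.

```python
def bosam_bitmap(neighbours, order):
    """Return the reordered adjacency matrix as a list of columns.

    bitmap[i][j] == 1 means a black pixel at coordinate (i, j),
    where i and j are 0-based positions in `order`.
    """
    pos = {v: i for i, v in enumerate(order)}
    n = len(order)
    bitmap = [[0] * n for _ in range(n)]
    for v, nbrs in neighbours.items():
        for u in nbrs:
            bitmap[pos[v]][pos[u]] = 1
    return bitmap


def to_text(bitmap):
    """Render the bitmap as text, with (0, 0) in the bottom-left corner."""
    n = len(bitmap)
    rows = []
    for j in reversed(range(n)):          # top row has the largest j
        rows.append("".join("#" if bitmap[i][j] else "." for i in range(n)))
    return "\n".join(rows)
```

For a real network one would write the matrix to an image file instead of text; the choice of image library is left open here.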

2.1 Degree distribution 

The degree distribution $P(k)$ is the probability of finding a $k$-degree node
in a network. For a network with $N$ nodes, the number of $k$-degree nodes is
$N_k = N \cdot P(k)$, and the number of nodes with degrees equal to or smaller
than $k$ is $N_{\le k} = N \cdot \sum_{i=1}^{k} P(i)$. According to the node
sorting rule of BOSAM, the indexes of $k$-degree nodes are
$I_k = \{ i \mid N_{\le k-1} < i \le N_{\le k} \}$.

Table 1. Properties of the networks under study: (a) the Erdős-Rényi (ER) model
[25], which generates random networks having a Poisson degree distribution; (b)
the Barabási-Albert (BA) model [3], which generates scale-free networks with a
power-law degree distribution; (c) the Positive-Feedback Preference (PFP) model
[211I1Z1I1H], which generates Internet-like networks; (d) the scientific
collaboration network [29][30], in which nodes represent scientists and a
connection exists if they coauthored at least one paper in the e-print archive
http://xxx.lanl.gov/archive/cond-mat/ from 1995 to 1998; (e) the protein
interaction network [6][31], in which nodes represent proteins in the yeast
Saccharomyces cerevisiae (http://dip.doe-mbi.ucla.edu/) and a connection exists
if they interact with each other; (f) the Internet network at the autonomous
system (AS) level [10][22], based on the traceroute data collected by CAIDA in
April 2003 [32][33]; and (g, h) the Internet AS networks based on the BGP data
collected by the Oregon Route Views project (http://www.routeviews.org/) in
October 2001 and September 2006, respectively. On the AS Internet, nodes
represent Internet service providers and a connection exists if they have a
commercial agreement to exchange traffic. The properties shown are: the number
of nodes $N$ and links $L$, the average node degree $\langle k \rangle = 2L/N$,
and the characteristic node degree $k^*$, i.e. the degree at which the degree
distribution is largest.

Network                        N        L        <k>    k*
(a) ER network                 10,000   30,000   6      6
(b) BA network                 10,000   30,000   6      3
(c) PFP network                10,000   30,000   6      2
(d) Scientific collaboration   15,179   43,011   5.7    2
(e) Protein interaction        4,713    14,846   6.3    1
(f) Internet (traceroute)      9,204    28,959   6.3    2
(g) Internet (BGP-2001)        12,033   21,742   3.6    1
(h) Internet (BGP-2006)        23,480   49,077   4.2    1

Each leaf is associated with a node degree $k$ because it is formed by pixels
representing connections to $k$-degree nodes. In other words, the leaf for
degree $k$ represents the entries $\{a_{ij} = 1 \mid i \in I_k\}$ in the
reordered adjacency matrix. Thus the width of the $k$-degree leaf is $N_k$.
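
The index range and width of each leaf follow directly from the degree sequence. The following is a small sketch, assuming 1-based indexes as in the text and using the empirical counts in place of $N \cdot P(k)$; the function name `leaf_ranges` is hypothetical.

```python
from collections import Counter


def leaf_ranges(degrees):
    """Map each degree k to (first_index, last_index, width) of its leaf.

    degrees: iterable with the degree of every node in the network.
    Indexes are 1-based; width is N_k, the number of k-degree nodes.
    """
    counts = Counter(degrees)      # N_k for every degree that occurs
    ranges = {}
    below = 0                      # number of nodes with degree < k
    for k in sorted(counts):
        n_k = counts[k]
        ranges[k] = (below + 1, below + n_k, n_k)
        below += n_k
    return ranges
```

With degrees `[1, 1, 1, 2, 2, 5]`, the leaf for degree 1 spans indexes 1 to 3, the leaf for degree 2 spans 4 to 5, and the leaf for degree 5 is the single index 6.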

In Fig. 1, the widest leaf in the ER network's BOSAM is for degree 6, which
reflects that the network has a Poisson degree distribution peaking at the
average node degree of 6. The other networks are 'scale-free', having a
power-law degree distribution [3]. This means most nodes are low-degree nodes,
whereas a small number of nodes have very large degrees. This is reflected in
the BOSAMs by leaf widths that decrease rapidly with the node degree.







Fig. 1. Bitmap of sorted adjacency matrix (BOSAM) for (a) the ER model, (b) the BA 
model, (c) the PFP model, (d) the scientific collaborations, (e) the protein interactions, 
and (f) the AS Internet. For each network, the envelope which is used as the root for 
the prediction is shown in blue colour and the predicted envelopes are shown in red 
colour. 
2.2 Degree-degree correlation

Degree-degree correlation is a widely studied property [111 E]. The protein inter- 
action network, the Internet and the PFP network have a negative degree-degree 
correlation, or so-called disassortative mixing [18], which means low-degree
nodes tend to connect with high-degree nodes and vice versa. This is reflected
in their BOSAMs in Fig. 1, where pixels are densely distributed along the upper
envelope of the leaves. In contrast, the scientific collaboration network
exhibits a positive degree-degree correlation, or assortative mixing [19], which
means nodes tend to connect with nodes of similar degree. This appears in the
BOSAM as a series of lines radiating from the top-right corner across the
leaves. The ER network and the BA network have a neutral degree-degree
correlation, which is shown in their BOSAMs by pixels being fairly evenly
distributed over the leaves.

2.3 Rich-club 

In the Internet and the PFP model, the high-degree nodes ('rich' nodes) are
tightly interconnected with each other, forming a rich-club [20][21]. This is
reflected in their BOSAMs by a top-right corner that is almost fully covered by
pixels. This is not the case for the ER network and the BA network, where
high-degree nodes are sparsely interconnected with each other.

In summary, BOSAM provides a simple and effective way of emphasising
different network structures. We can compare network topologies just by looking
at their BOSAMs. For example, one can see that although the BA model has been
widely used as a generic model for all scale-free networks, it does not closely
resemble the Internet, the protein interaction network or the scientific
collaboration network. In fact, the three real networks themselves differ from
each other in profound ways. The PFP model closely resembles the Internet based
on the traceroute data (see Table 1).



3 Self-similar property of BOSAM 

One improvement over the previous version of BOSAM [14] is that here we also
consider the largest neighbour index $e$ in the node sorting rule. This reduces
zigzags in the BOSAM, and as a result the envelopes of the leaves become smooth,
solid curves.

The envelope of the leaf for degree $k$ consists of $N_k$ pixels given as

\[ E_k = \{ (i, e_i) \mid i \in I_k \}, \tag{1} \]

where $e_i$ is the largest neighbour index of node $i$ and $I_k$ is the set of
indexes of $k$-degree nodes. According to the node sorting rule of BOSAM (see
Section 2), the degree of node $e_i$ is the largest neighbour degree of node
$i$, i.e. $k_{e_i} = \omega_i$. One can see that the envelope $E_k$ is given by
the cumulative distribution function $F_k(\omega)$, which is the probability
that the largest neighbour degree of a $k$-degree node is less than or equal to
$\omega$.
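
As an illustration of Eq. (1), the envelope of a leaf can be read off a sorted node list. This is a sketch under assumptions: `envelope` is a hypothetical helper, `order` is taken to be the list of nodes already sorted by the BOSAM rule, and indexes are 1-based as in the text.

```python
def envelope(neighbours, order, k):
    """Return E_k as a list of (i, e_i) pairs for all k-degree nodes.

    neighbours: dict mapping each node to a collection of its neighbours.
    order:      nodes in BOSAM order; a node's index is its position + 1.
    """
    if k < 1:
        return []                  # isolated nodes have no neighbour index
    index = {v: i + 1 for i, v in enumerate(order)}
    points = []
    for v in order:
        if len(neighbours[v]) == k:
            e = max(index[u] for u in neighbours[v])   # largest neighbour index
            points.append((index[v], e))
    return points
```

Each returned pair is the coordinate of the topmost pixel in one column of the leaf.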
3.1 Self-similar relation between BOSAM envelopes

Theorem 1. In a network, if a node's neighbours degree $\mu$ is independent and
identically distributed (i.i.d.), the cumulative distribution functions
$F_k(\omega)$ and $F_l(\omega)$ for $k$-degree nodes and $l$-degree nodes,
respectively, have the following self-similar relation,

\[ F_k(\omega) = [F_l(\omega)]^{k/l}. \tag{2} \]

The proof of Theorem 1 is given in Appendix I. This self-similar property
of BOSAM allows us to use just one envelope, which we call the root envelope, to
predict all other envelopes (see the self-similar algorithm in Appendix III).
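
The relation in Theorem 1 can be checked numerically for a toy case. The sketch below assumes the i.i.d. condition holds exactly and uses a made-up neighbour-degree distribution `pmf`; it computes the distribution of the largest of $k$ neighbour degrees by brute-force enumeration rather than by the formula, and then compares the result for two degrees. The function name `cdf_of_max` is hypothetical.

```python
from itertools import product


def cdf_of_max(pmf, k, omega):
    """P(largest of k i.i.d. neighbour degrees <= omega), by enumeration.

    pmf: dict mapping a neighbour degree to its probability.
    """
    total = 0.0
    for combo in product(pmf, repeat=k):   # every k-tuple of neighbour degrees
        if max(combo) <= omega:
            prob = 1.0
            for d in combo:
                prob *= pmf[d]
            total += prob
    return total


# Hypothetical neighbour-degree distribution, chosen only for illustration.
pmf = {1: 0.5, 2: 0.3, 5: 0.2}
k, l = 4, 2
for omega in sorted(pmf):
    f_k = cdf_of_max(pmf, k, omega)
    f_l = cdf_of_max(pmf, l, omega)
    # Theorem 1: F_k(omega) equals F_l(omega) raised to the power k/l.
    assert abs(f_k - f_l ** (k / l)) < 1e-12
```

The enumeration grows exponentially with $k$, so this is only a sanity check for small degrees, not a practical way to compute envelopes.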

The characteristic degree $k^*$ is the node degree having the largest number of
nodes, i.e. $N_{k^*} \ge N_k$ or $P(k^*) \ge P(k)$ for all $k$. The envelope of
the leaf for the characteristic degree contains more information than other
envelopes. Fig. 1 illustrates the prediction result. For each network, we use
the envelope for the characteristic degree (see Table 1) as the root envelope.
We highlight the root envelope in blue and the predicted envelopes for other
degrees in red. One can see that the predicted envelopes overlap closely with
the real envelopes beneath them.
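
Choosing the root envelope requires the characteristic degree, which can be found from the degree sequence alone. This is a minimal sketch; the name `characteristic_degree` is hypothetical, and the tie-breaking rule (the smallest degree wins) is an assumption, since the text does not say what happens when two degrees are equally common.

```python
from collections import Counter


def characteristic_degree(degrees):
    """Return k*, the degree shared by the largest number of nodes."""
    counts = Counter(degrees)
    # Iterate degrees in ascending order so ties go to the smallest degree.
    return max(sorted(counts), key=counts.get)
```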

3.2 Discussion 

The self-similar relation between BOSAM envelopes is different from other
scaling properties in networks, such as the scaling property of community size
in social networks [34]. As shown in the proof of Theorem 1, the self-similar
relation between BOSAM envelopes originates from the way we reorder the
adjacency matrix, and therefore it holds regardless of a network's degree
distribution or degree-degree correlation, provided the condition of Theorem 1
is met.

The only condition for the proof of Theorem 1 is that the neighbours degree
$\mu$ is independent and identically distributed (i.i.d.). One should not
confuse this condition with the degree-degree correlation property of a network.
The i.i.d. condition means that a network's degree-degree correlation (whether
negative or positive) is consistent for all nodes. We can infer whether a
network satisfies the i.i.d. condition by observing whether the prediction of
BOSAM envelopes is accurate. Fig. 1 shows that most networks under study satisfy
the condition. By comparison, the predicted envelopes for the protein
interaction network match the real envelopes closely but not precisely. This
suggests that the protein interaction network is not strictly i.i.d.

3.3 Scaling of BOSAM envelopes 

If two networks have the same macroscopic structure, their BOSAM pictures
should look the same, even if the networks are of different sizes. For example,
it is known that the BA model preserves its topological structure during network
growth [3]. Accordingly, as shown in Fig. 2(a), the BOSAM pictures for the three
BA networks of different sizes indeed look the same. We can use the scaling
algorithm in Appendix IV to accurately predict the envelopes in the two larger
networks (in red) from the envelopes in the small network (which are themselves
predicted from the root envelope, in blue). Thus the scaling property of BOSAM
envelopes can be used to test whether two networks of different sizes have the
same macroscopic structure. Fig. 2(b) shows the BOSAM pictures for the Internet
networks based on the BGP data collected in 2001 and 2006 (see Table 1). The
envelopes in the large network are precisely predicted by scaling the envelopes
in the small network. This suggests that during the 5-year period, although the
Internet doubled in size, it largely preserved its macroscopic structure.



4 BOSAM envelope as network fingerprint 

Based on the above observations, we note the analogy between a BOSAM envelope
and a human fingerprint in the following ways. (1) A BOSAM envelope is a small
token of the network's adjacency matrix, in the form of a set of coordinates
$E_k = \{(i, e_i) \mid i \in I_k\}$. Such a relatively small amount of
information is easy to obtain, store and process. (2) A single BOSAM envelope is
able to recover all other envelopes and thus provide an outline description of a
network's BOSAM. (3) The envelope fingerprint remains valid for a growing
network as long as the network preserves its macroscopic structure, just as a
person's fingerprint is valid for life. (4) A BOSAM envelope contains essential
information that characterises the network's topology.

Fig. 3 shows one envelope fingerprint for each of the networks under study.
For comparison, all the envelopes shown are those of the leaves for node degree
2, except for the BA network, which does not contain 2-degree nodes and for
which the envelope for degree 3 is shown instead. The sizes of the envelopes are
normalised by the number of nodes in the networks. We can see that the envelope
fingerprints of the networks are strongly characteristic.

Fig. 3 includes two fingerprints for Internet networks based on different data
sources [21]. The fingerprint for the traceroute Internet (line 6) is positioned
to the left of that for the BGP Internet (line 7). This reflects one of the key
differences between the two data sources: 1-degree nodes account for a larger
proportion in the BGP data than in the traceroute data. However, the two
fingerprints have the same shape, which suggests that the macroscopic structures
of the two Internet networks are similar. The close approximation between the
traceroute Internet and the PFP network (line 3) is evident from the close match
of their fingerprints. One would expect that a minor revision would enable the
PFP model to resemble the BGP Internet as well.



5 Conclusion 

BOSAM is a visualisation tool for network topologies. This simple tool provides
an effective way to emphasise topological differences or similarities between
networks just by looking at the bitmaps of their reordered adjacency matrices.
Fig. 2. Prediction of BOSAM envelopes for growing networks. (a) BA networks with
the number of nodes N = 5,000, N = 10,000 and N = 20,000. (b) Internet networks
based on BGP data collected in 2001 and 2006 (see Table 1). In (a) and (b),
red envelopes in larger networks are predicted from the blue envelope in the
smallest network.
Fig. 3. BOSAM fingerprints for all networks under study. Each envelope is
normalised by the number of nodes in its own network.



A network's BOSAM is characterised by a series of leaves, and the shapes
of the leaves are described by their envelopes. We show there is a self-similar
relation between the envelopes for most networks. This property allows us to
use one single envelope to reconstruct all envelopes. For an evolving network
which preserves its structure, the BOSAM envelopes scale with the growing size
of the network. In these respects we suggest that the BOSAM envelope can be
used as a self-similar fingerprint for network topologies.
APPENDIX I Proof of Theorem 1

For a $k$-degree node, the neighbours degree $\mu$ consists of $k$ variates
$\mu_1, \mu_2, ..., \mu_k$. We reorder them so that
$\nu_1 \le \nu_2 \le ... \le \nu_k$. Thus the largest neighbour degree is
$\omega = \nu_k$.

According to order statistic theory (see Appendix II), the probability
function of $\omega$ for $k$-degree nodes is given by

\[ P_k(\omega) = P_k(\nu_k) = k \, [F_k(\mu)]^{k-1} P_k(\mu), \tag{3} \]

where $P_k(\mu)$ is the probability function of $\mu$ and $F_k(\mu)$ is the
cumulative distribution function of $\mu$, for $k$-degree nodes, both evaluated
at $\mu = \omega$.

In a network where $\mu$ is i.i.d., we have $P_k(\mu) = P(\mu)$ and
$F_k(\mu) = F(\mu)$. Then

\[ P_k(\omega) = k \, [F(\mu)]^{k-1} P(\mu). \tag{4} \]

Therefore we have

\begin{align*}
F_k(\omega) &= \int_0^{\omega} P_k(\omega) \, d\omega \tag{5} \\
            &= \int_0^{\mu} k \, [F(\mu)]^{k-1} P(\mu) \, d\mu \tag{6} \\
            &= \int_0^{\mu} k \, [F(\mu)]^{k-1} \frac{dF(\mu)}{d\mu} \, d\mu \tag{7} \\
            &= \int_0^{F(\mu)} k \, [F(\mu)]^{k-1} \, dF(\mu) \tag{8} \\
            &= \int_0^{F(\mu)} \left( [F(\mu)]^{k} \right)' \, dF(\mu) \tag{9} \\
            &= [F(\mu)]^{k}. \tag{10}
\end{align*}

Similarly we have $F_l(\omega) = [F(\mu)]^{l}$ for $l$-degree nodes. Thus
Theorem 1 is proved:

\[ F_k(\omega) = [F(\mu)]^{k} = \left[ [F(\mu)]^{l} \right]^{k/l} = [F_l(\omega)]^{k/l}. \tag{11} \]



APPENDIX II Order statistic 

Given a sample of $N$ variates $X_1, X_2, ..., X_N$, reorder them so that
$Y_1 \le Y_2 \le ... \le Y_N$. Then $Y_r$ is called the $r$-th order statistic
[23] for $r = 1, 2, ..., N$.

If $X$ has the probability function $P(X)$ and the cumulative distribution
function (CDF) $F(X)$, then the probability function of $Y_r$ is given by [24]

\[ P(Y_r) = \frac{N!}{(r-1)! \, (N-r)!} \, [F(X)]^{r-1} \, [1 - F(X)]^{N-r} \, P(X). \tag{12} \]

Therefore the probability function of the largest value $Y_N$ is

\[ P(Y_N) = N \, [F(X)]^{N-1} P(X). \tag{13} \]
APPENDIX III Algorithm for predicting envelopes for
other degrees

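For a network with $N$ nodes and degree distribution $P(k)$, if we know the
BOSAM envelope $E_l = \{(i, e_i) \mid i \in I_l\}$ for degree $l$, then we can
use Theorem 1 to predict the envelope $E_k$ for degree $k$ as

\[ E_k = \{ (\hat{i}, e_i) \mid i \in I_l \}, \tag{14} \]

where

\[ \hat{i} = N_{<k} + N_k \left( \frac{i - N_{<l}}{N_l} \right)^{k/l}, \tag{15} \]

and

\[ N_k = N \cdot P(k), \qquad N_{<k} = N \cdot \sum_{j=1}^{k-1} P(j). \]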
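
The prediction step can be sketched in Python. This is an illustration under a stated assumption, not the authors' code: it takes the relative position of a node inside the root leaf as the value of $F_l$ and maps it through the power $k/l$ of Theorem 1 to a position inside the target leaf, keeping the vertical coordinate unchanged. The function name `predict_envelope` and its parameter names are hypothetical.

```python
def predict_envelope(root, n_l, below_l, n_k, below_k, k, l):
    """Predict the envelope for degree k from the root envelope for degree l.

    root:    list of (i, e_i) points of the root envelope (1-based indexes).
    n_l:     number of l-degree nodes (width of the root leaf).
    below_l: number of nodes with degree smaller than l.
    n_k:     number of k-degree nodes (width of the target leaf).
    below_k: number of nodes with degree smaller than k.
    """
    predicted = []
    for i, e in root:
        frac = (i - below_l) / n_l                 # position in root leaf, in (0, 1]
        i_hat = below_k + n_k * frac ** (k / l)    # position in target leaf
        predicted.append((i_hat, e))               # the height e_i is unchanged
    return predicted
```

The predicted horizontal positions are real numbers; rounding them to pixel columns is left to the plotting step.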



APPENDIX IV Algorithm for scaling envelopes with 
network size 

For a network with $N$ nodes, if we know the BOSAM envelope
$E_k = \{(i, e_i) \mid i \in I_k\}$ for degree $k$, the scaling of the envelope
to the network size of $N'$ is given by

\[ E'_k = \{ (i', e'_i) \mid i \in I_k \}, \tag{16} \]

where

\[ i' = \frac{N'}{N} \, i \]

and

\[ e'_i = \frac{N'}{N} \, e_i. \]
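
A minimal sketch of the scaling step follows, assuming that both coordinates of every envelope point are simply multiplied by the ratio of the two network sizes. The linear form is an assumption made for illustration, and the function name `scale_envelope` is hypothetical.

```python
def scale_envelope(envelope, n, n_new):
    """Scale envelope points (i, e_i) from a network of n nodes to n_new nodes."""
    ratio = n_new / n
    # Both the node index and the largest neighbour index are stretched equally.
    return [(i * ratio, e * ratio) for i, e in envelope]
```

Setting `n_new = 1` gives coordinates normalised by the number of nodes, which is one way to compare envelopes from networks of different sizes.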



